Overview

Dataset statistics

Number of variables: 39
Number of observations: 259211
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 9673
Duplicate rows (%): 3.7%
Total size in memory: 186.0 MiB
Average record size in memory: 752.6 B
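
The two derived figures in this table follow from the raw counts. A minimal sketch of that arithmetic, assuming MiB means 2**20 bytes; the variable names are illustrative and not part of the report:

```python
# Recompute two derived figures from the raw values in the dataset statistics.
n_observations = 259211
n_duplicate_rows = 9673
total_size_mib = 186.0  # already rounded to one decimal place in the report

duplicate_pct = 100 * n_duplicate_rows / n_observations
# Assumption: 1 MiB = 2**20 bytes.
avg_record_bytes = total_size_mib * 2**20 / n_observations

print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

The average record size computed this way differs from the reported 752.6 B by a fraction of a byte, which is expected because the total size used here is itself a rounded value.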

Variable types

Numeric: 8
Categorical: 31

Alerts

Dataset has 9673 (3.7%) duplicate rows [Duplicates]
count_floors_pre_eq is highly overall correlated with height_percentage and 1 other field [High correlation]
foundation_type is highly overall correlated with has_superstructure_cement_mortar_brick and 4 other fields [High correlation]
ground_floor_type is highly overall correlated with has_superstructure_cement_mortar_brick and 1 other field [High correlation]
has_secondary_use is highly overall correlated with has_secondary_use_agriculture and 1 other field [High correlation]
has_secondary_use_agriculture is highly overall correlated with has_secondary_use [High correlation]
has_secondary_use_hotel is highly overall correlated with has_secondary_use [High correlation]
has_superstructure_cement_mortar_brick is highly overall correlated with foundation_type and 1 other field [High correlation]
has_superstructure_mud_mortar_stone is highly overall correlated with foundation_type and 1 other field [High correlation]
has_superstructure_rc_engineered is highly overall correlated with foundation_type [High correlation]
has_superstructure_rc_non_engineered is highly overall correlated with foundation_type [High correlation]
height_percentage is highly overall correlated with count_floors_pre_eq [High correlation]
other_floor_type is highly overall correlated with count_floors_pre_eq and 1 other field [High correlation]
roof_type is highly overall correlated with foundation_type and 1 other field [High correlation]
land_surface_condition is highly imbalanced (51.3%) [Imbalance]
foundation_type is highly imbalanced (60.9%) [Imbalance]
ground_floor_type is highly imbalanced (59.2%) [Imbalance]
position is highly imbalanced (50.4%) [Imbalance]
plan_configuration is highly imbalanced (90.7%) [Imbalance]
has_superstructure_adobe_mud is highly imbalanced (56.9%) [Imbalance]
has_superstructure_stone_flag is highly imbalanced (78.5%) [Imbalance]
has_superstructure_cement_mortar_stone is highly imbalanced (86.9%) [Imbalance]
has_superstructure_mud_mortar_brick is highly imbalanced (64.2%) [Imbalance]
has_superstructure_cement_mortar_brick is highly imbalanced (61.4%) [Imbalance]
has_superstructure_bamboo is highly imbalanced (58.1%) [Imbalance]
has_superstructure_rc_non_engineered is highly imbalanced (74.6%) [Imbalance]
has_superstructure_rc_engineered is highly imbalanced (88.2%) [Imbalance]
has_superstructure_other is highly imbalanced (88.8%) [Imbalance]
legal_ownership_status is highly imbalanced (86.0%) [Imbalance]
has_secondary_use_agriculture is highly imbalanced (65.5%) [Imbalance]
has_secondary_use_hotel is highly imbalanced (78.8%) [Imbalance]
has_secondary_use_rental is highly imbalanced (93.2%) [Imbalance]
has_secondary_use_institution is highly imbalanced (98.9%) [Imbalance]
has_secondary_use_school is highly imbalanced (99.5%) [Imbalance]
has_secondary_use_industry is highly imbalanced (98.8%) [Imbalance]
has_secondary_use_health_post is highly imbalanced (99.7%) [Imbalance]
has_secondary_use_gov_office is highly imbalanced (99.8%) [Imbalance]
has_secondary_use_use_police is highly imbalanced (99.9%) [Imbalance]
has_secondary_use_other is highly imbalanced (95.4%) [Imbalance]
geo_level_1_id has 3991 (1.5%) zeros [Zeros]
age has 26041 (10.0%) zeros [Zeros]
count_families has 20679 (8.0%) zeros [Zeros]
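
The imbalance percentages in the alerts are consistent with a score of one minus the normalized Shannon entropy of the class counts. That formula is an inference from the reported numbers, not something this report states, so the sketch below is a plausibility check rather than the tool's confirmed implementation. The function name `imbalance_score` is hypothetical; the counts are copied from the variable sections of this report.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy of the class
    counts and k is the number of classes: 0 for a perfectly uniform column,
    1 when a single class holds every row."""
    k = len(counts)
    if k < 2:
        return 0.0  # assumption: treat a single-class column as score 0
    total = sum(counts)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)

# (class counts from this report, imbalance percentage reported in the alerts)
examples = {
    "land_surface_condition": ([215522, 35389, 8300], 51.3),
    "foundation_type": ([217932, 15108, 14165, 10558, 1448], 60.9),
    "has_superstructure_stone_flag": ([250332, 8879], 78.5),
}

for name, (counts, reported) in examples.items():
    computed = 100 * imbalance_score(counts)
    print(f"{name}: computed {computed:.2f}%, reported {reported}%")
```

A column split evenly between its classes scores 0 under this definition, so a binary flag where 96.6% of rows share one value scores far higher than the raw majority share alone would suggest.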

Reproduction

Analysis started: 2024-04-22 12:57:04.935923
Analysis finished: 2024-04-22 12:58:37.264842
Duration: 1 minute and 32.33 seconds
Software version: ydata-profiling v4.7.0
Download configuration: config.json
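
The reported duration can be checked against the two timestamps. A small sketch using only the Python standard library; the variable names are illustrative:

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2024-04-22 12:57:04.935923")
finished = datetime.fromisoformat("2024-04-22 12:58:37.264842")

elapsed = (finished - started).total_seconds()
minutes, seconds = divmod(elapsed, 60)
print(f"{int(minutes)} minute(s) and {seconds:.2f} seconds")
```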

Variables

geo_level_1_id
Real number (ℝ)

ZEROS 

Distinct: 31
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 13.895664
Minimum: 0
Maximum: 30
Zeros: 3991
Zeros (%): 1.5%
Negative: 0
Negative (%): 0.0%
Memory size: 12.0 MiB

Quantile statistics

Minimum: 0
5-th percentile: 3
Q1: 7
Median: 12
Q3: 21
95-th percentile: 27
Maximum: 30
Range: 30
Interquartile range (IQR): 14

Descriptive statistics

Standard deviation: 8.0291215
Coefficient of variation (CV): 0.57781488
Kurtosis: -1.2112102
Mean: 13.895664
Median Absolute Deviation (MAD): 6
Skewness: 0.27298151
Sum: 3601909
Variance: 64.466792
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=31)
ValueCountFrequency (%)
6 24136
 
9.3%
26 22464
 
8.7%
10 22015
 
8.5%
17 21748
 
8.4%
8 19011
 
7.3%
7 18923
 
7.3%
20 17214
 
6.6%
21 14844
 
5.7%
4 14545
 
5.6%
27 12419
 
4.8%
Other values (21) 71892
27.7%
ValueCountFrequency (%)
0 3991
 
1.5%
1 2701
 
1.0%
2 882
 
0.3%
3 7539
 
2.9%
4 14545
5.6%
5 2633
 
1.0%
6 24136
9.3%
7 18923
7.3%
8 19011
7.3%
9 3958
 
1.5%
ValueCountFrequency (%)
30 2674
 
1.0%
29 396
 
0.2%
28 265
 
0.1%
27 12419
4.8%
26 22464
8.7%
25 5548
 
2.1%
24 1239
 
0.5%
23 1121
 
0.4%
22 6203
 
2.4%
21 14844
5.7%

geo_level_2_id
Real number (ℝ)

Distinct: 1414
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 700.90838
Minimum: 0
Maximum: 1427
Zeros: 38
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 12.0 MiB

Quantile statistics

Minimum: 0
5-th percentile: 69
Q1: 350
Median: 702
Q3: 1050
95-th percentile: 1377
Maximum: 1427
Range: 1427
Interquartile range (IQR): 700

Descriptive statistics

Standard deviation: 412.87935
Coefficient of variation (CV): 0.58906322
Kurtosis: -1.1889895
Mean: 700.90838
Median Absolute Deviation (MAD): 349
Skewness: 0.029430122
Sum: 1.8168316 × 10^8
Variance: 170469.36
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
39 3992
 
1.5%
158 2520
 
1.0%
1387 2040
 
0.8%
181 2001
 
0.8%
157 1896
 
0.7%
363 1760
 
0.7%
463 1740
 
0.7%
673 1704
 
0.7%
533 1677
 
0.6%
883 1626
 
0.6%
Other values (1404) 238255
91.9%
ValueCountFrequency (%)
0 38
 
< 0.1%
1 204
0.1%
3 77
 
< 0.1%
4 315
0.1%
5 25
 
< 0.1%
6 2
 
< 0.1%
7 100
 
< 0.1%
8 120
 
< 0.1%
9 333
0.1%
10 354
0.1%
ValueCountFrequency (%)
1427 6
 
< 0.1%
1426 255
0.1%
1425 466
0.2%
1424 7
 
< 0.1%
1423 3
 
< 0.1%
1422 216
0.1%
1421 254
0.1%
1420 10
 
< 0.1%
1419 94
 
< 0.1%
1418 152
 
0.1%

geo_level_3_id
Real number (ℝ)

Distinct: 11581
Distinct (%): 4.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6259.1858
Minimum: 0
Maximum: 12567
Zeros: 2
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 12.0 MiB

Quantile statistics

Minimum: 0
5-th percentile: 609
Q1: 3073
Median: 6272
Q3: 9412
95-th percentile: 11928
Maximum: 12567
Range: 12567
Interquartile range (IQR): 6339

Descriptive statistics

Standard deviation: 3647.1488
Coefficient of variation (CV): 0.58268741
Kurtosis: -1.2135858
Mean: 6259.1858
Median Absolute Deviation (MAD): 3171
Skewness: -0.00013750267
Sum: 1.6224498 × 10^9
Variance: 13301694
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
633 651
 
0.3%
9133 647
 
0.2%
621 530
 
0.2%
11246 470
 
0.2%
2005 466
 
0.2%
11440 455
 
0.2%
7723 442
 
0.2%
9229 381
 
0.1%
2452 349
 
0.1%
12258 312
 
0.1%
Other values (11571) 254508
98.2%
ValueCountFrequency (%)
0 2
 
< 0.1%
1 6
 
< 0.1%
3 9
 
< 0.1%
5 14
 
< 0.1%
6 21
 
< 0.1%
7 2
 
< 0.1%
8 31
< 0.1%
9 3
 
< 0.1%
10 1
 
< 0.1%
11 62
< 0.1%
ValueCountFrequency (%)
12567 1
 
< 0.1%
12565 7
 
< 0.1%
12564 6
 
< 0.1%
12563 24
< 0.1%
12562 3
 
< 0.1%
12561 19
< 0.1%
12560 17
 
< 0.1%
12559 6
 
< 0.1%
12558 6
 
< 0.1%
12557 44
< 0.1%

count_floors_pre_eq
Real number (ℝ)

HIGH CORRELATION 

Distinct: 9
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.1293078
Minimum: 1
Maximum: 9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 12.0 MiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
Median: 2
Q3: 2
95-th percentile: 3
Maximum: 9
Range: 8
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.72730734
Coefficient of variation (CV): 0.34156985
Kurtosis: 2.3100976
Mean: 2.1293078
Median Absolute Deviation (MAD): 0
Skewness: 0.8303379
Sum: 551940
Variance: 0.52897597
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=9)
ValueCountFrequency (%)
2 155754
60.1%
3 55332
 
21.3%
1 40272
 
15.5%
4 5390
 
2.1%
5 2218
 
0.9%
6 204
 
0.1%
7 39
 
< 0.1%
8 1
 
< 0.1%
9 1
 
< 0.1%
ValueCountFrequency (%)
1 40272
 
15.5%
2 155754
60.1%
3 55332
 
21.3%
4 5390
 
2.1%
5 2218
 
0.9%
6 204
 
0.1%
7 39
 
< 0.1%
8 1
 
< 0.1%
9 1
 
< 0.1%
ValueCountFrequency (%)
9 1
 
< 0.1%
8 1
 
< 0.1%
7 39
 
< 0.1%
6 204
 
0.1%
5 2218
 
0.9%
4 5390
 
2.1%
3 55332
 
21.3%
2 155754
60.1%
1 40272
 
15.5%
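
Because the frequency tables for count_floors_pre_eq list all nine distinct values, several of the descriptive statistics reported for it can be recomputed directly from the counts. The sketch below is a consistency check under one assumption: the variance uses the sample convention (n - 1 in the denominator). The name `floor_counts` is illustrative.

```python
import math

# value -> count, copied from the count_floors_pre_eq frequency table
floor_counts = {1: 40272, 2: 155754, 3: 55332, 4: 5390, 5: 2218,
                6: 204, 7: 39, 8: 1, 9: 1}

n = sum(floor_counts.values())                        # number of observations
total = sum(v * c for v, c in floor_counts.items())   # the "Sum" statistic
mean = total / n

# Assumption: sample variance, dividing by n - 1 rather than n.
variance = sum(c * (v - mean) ** 2 for v, c in floor_counts.items()) / (n - 1)
std = math.sqrt(variance)
cv = std / mean  # coefficient of variation

print(f"n={n}, sum={total}, mean={mean:.7f}")
print(f"variance={variance:.6f}, std={std:.6f}, cv={cv:.6f}")
```

The observation count and sum reproduce exactly (259211 and 551940), and the mean, variance, standard deviation and coefficient of variation agree with the reported values to the precision shown in the report.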

age
Real number (ℝ)

ZEROS 

Distinct: 41
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 21.341706
Minimum: 0
Maximum: 200
Zeros: 26041
Zeros (%): 10.0%
Negative: 0
Negative (%): 0.0%
Memory size: 12.0 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 10
Median: 15
Q3: 30
95-th percentile: 60
Maximum: 200
Range: 200
Interquartile range (IQR): 20

Descriptive statistics

Standard deviation: 19.606818
Coefficient of variation (CV): 0.918709
Kurtosis: 7.1486475
Mean: 21.341706
Median Absolute Deviation (MAD): 10
Skewness: 2.0455646
Sum: 5532005
Variance: 384.4273
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=41)
ValueCountFrequency (%)
10 38896
15.0%
15 36010
13.9%
5 33697
13.0%
20 32182
12.4%
0 26041
10.0%
25 24366
9.4%
30 18028
7.0%
35 10710
 
4.1%
40 10559
 
4.1%
50 7257
 
2.8%
Other values (31) 21465
8.3%
ValueCountFrequency (%)
0 26041
10.0%
5 33697
13.0%
10 38896
15.0%
15 36010
13.9%
20 32182
12.4%
25 24366
9.4%
30 18028
7.0%
35 10710
 
4.1%
40 10559
 
4.1%
45 4711
 
1.8%
ValueCountFrequency (%)
200 106
< 0.1%
195 2
 
< 0.1%
190 3
 
< 0.1%
185 1
 
< 0.1%
180 7
 
< 0.1%
175 5
 
< 0.1%
170 6
 
< 0.1%
165 2
 
< 0.1%
160 6
 
< 0.1%
155 1
 
< 0.1%

area_percentage
Real number (ℝ)

Distinct: 84
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 8.0170325
Minimum: 1
Maximum: 100
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 12.0 MiB

Quantile statistics

Minimum: 1
5-th percentile: 3
Q1: 5
Median: 7
Q3: 9
95-th percentile: 16
Maximum: 100
Range: 99
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 4.3938038
Coefficient of variation (CV): 0.54805863
Kurtosis: 30.526743
Mean: 8.0170325
Median Absolute Deviation (MAD): 2
Skewness: 3.5317558
Sum: 2078103
Variance: 19.305512
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
6 41809
16.1%
7 36586
14.1%
5 32562
12.6%
8 28248
10.9%
9 22035
8.5%
4 19127
7.4%
10 15528
 
6.0%
11 13840
 
5.3%
3 11800
 
4.6%
12 7547
 
2.9%
Other values (74) 30129
11.6%
ValueCountFrequency (%)
1 90
 
< 0.1%
2 3168
 
1.2%
3 11800
 
4.6%
4 19127
7.4%
5 32562
12.6%
6 41809
16.1%
7 36586
14.1%
8 28248
10.9%
9 22035
8.5%
10 15528
 
6.0%
ValueCountFrequency (%)
100 1
 
< 0.1%
96 3
< 0.1%
90 1
 
< 0.1%
86 5
< 0.1%
85 4
< 0.1%
84 3
< 0.1%
83 3
< 0.1%
82 1
 
< 0.1%
80 1
 
< 0.1%
78 1
 
< 0.1%

height_percentage
Real number (ℝ)

HIGH CORRELATION 

Distinct: 27
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.4333728
Minimum: 2
Maximum: 32
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 12.0 MiB

Quantile statistics

Minimum: 2
5-th percentile: 3
Q1: 4
Median: 5
Q3: 6
95-th percentile: 9
Maximum: 32
Range: 30
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.9179276
Coefficient of variation (CV): 0.35299025
Kurtosis: 14.389082
Mean: 5.4333728
Median Absolute Deviation (MAD): 1
Skewness: 1.8120859
Sum: 1408390
Variance: 3.6784463
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=27)
ValueCountFrequency (%)
5 78094
30.1%
6 46287
17.9%
4 37551
14.5%
7 35225
13.6%
3 25851
 
10.0%
8 13826
 
5.3%
2 9257
 
3.6%
9 5334
 
2.1%
10 4465
 
1.7%
12 905
 
0.3%
Other values (17) 2416
 
0.9%
ValueCountFrequency (%)
2 9257
 
3.6%
3 25851
 
10.0%
4 37551
14.5%
5 78094
30.1%
6 46287
17.9%
7 35225
13.6%
8 13826
 
5.3%
9 5334
 
2.1%
10 4465
 
1.7%
11 903
 
0.3%
ValueCountFrequency (%)
32 75
< 0.1%
31 1
 
< 0.1%
28 2
 
< 0.1%
26 2
 
< 0.1%
25 3
 
< 0.1%
24 4
 
< 0.1%
23 11
 
< 0.1%
21 13
 
< 0.1%
20 33
< 0.1%
19 7
 
< 0.1%

land_surface_condition
Categorical

IMBALANCE 

Distinct3
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size24.4 MiB
t
215522 
n
35389 
o
 
8300

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters259211
Distinct characters3
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
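
As an illustration of the character properties mentioned above, Python's standard `unicodedata` module exposes the general category of each code point ("Ll" for Lowercase Letter, "Nd" for Decimal Number). It does not expose script or block, so those two properties are left out of this sketch. The helper name `category_counts` is hypothetical, and the inputs are the five sample rows shown in this report for land_surface_condition and has_superstructure_adobe_mud.

```python
import unicodedata
from collections import Counter

def category_counts(values):
    """Count the characters in all values by Unicode general category code."""
    return Counter(unicodedata.category(ch) for value in values for ch in value)

# Sample rows for land_surface_condition: single lowercase letters.
letters = category_counts(["t", "o", "t", "t", "t"])
# Sample rows for has_superstructure_adobe_mud: single decimal digits.
digits = category_counts(["1", "0", "0", "0", "1"])

print(letters)
print(digits)
```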

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowt
2nd rowo
3rd rowt
4th rowt
5th rowt

Common Values

ValueCountFrequency (%)
t 215522
83.1%
n 35389
 
13.7%
o 8300
 
3.2%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
t 215522
83.1%
n 35389
 
13.7%
o 8300
 
3.2%

Most occurring characters

ValueCountFrequency (%)
t 215522
83.1%
n 35389
 
13.7%
o 8300
 
3.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 259211
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t 215522
83.1%
n 35389
 
13.7%
o 8300
 
3.2%

Most occurring scripts

ValueCountFrequency (%)
Latin 259211
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
t 215522
83.1%
n 35389
 
13.7%
o 8300
 
3.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 259211
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
t 215522
83.1%
n 35389
 
13.7%
o 8300
 
3.2%

foundation_type
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct5
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size24.4 MiB
r
217932 
w
 
15108
u
 
14165
i
 
10558
h
 
1448

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters259211
Distinct characters5
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowr
2nd rowr
3rd rowr
4th rowr
5th rowr

Common Values

ValueCountFrequency (%)
r 217932
84.1%
w 15108
 
5.8%
u 14165
 
5.5%
i 10558
 
4.1%
h 1448
 
0.6%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
r 217932
84.1%
w 15108
 
5.8%
u 14165
 
5.5%
i 10558
 
4.1%
h 1448
 
0.6%

Most occurring characters

ValueCountFrequency (%)
r 217932
84.1%
w 15108
 
5.8%
u 14165
 
5.5%
i 10558
 
4.1%
h 1448
 
0.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 259211
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
r 217932
84.1%
w 15108
 
5.8%
u 14165
 
5.5%
i 10558
 
4.1%
h 1448
 
0.6%

Most occurring scripts

ValueCountFrequency (%)
Latin 259211
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
r 217932
84.1%
w 15108
 
5.8%
u 14165
 
5.5%
i 10558
 
4.1%
h 1448
 
0.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 259211
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
r 217932
84.1%
w 15108
 
5.8%
u 14165
 
5.5%
i 10558
 
4.1%
h 1448
 
0.6%

roof_type
Categorical

HIGH CORRELATION 

Distinct3
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size24.4 MiB
n
181762 
q
61333 
x
 
16116

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters259211
Distinct characters3
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rown
2nd rown
3rd rown
4th rown
5th rown

Common Values

ValueCountFrequency (%)
n 181762
70.1%
q 61333
 
23.7%
x 16116
 
6.2%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
n 181762
70.1%
q 61333
 
23.7%
x 16116
 
6.2%

Most occurring characters

ValueCountFrequency (%)
n 181762
70.1%
q 61333
 
23.7%
x 16116
 
6.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 259211
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n 181762
70.1%
q 61333
 
23.7%
x 16116
 
6.2%

Most occurring scripts

ValueCountFrequency (%)
Latin 259211
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
n 181762
70.1%
q 61333
 
23.7%
x 16116
 
6.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 259211
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
n 181762
70.1%
q 61333
 
23.7%
x 16116
 
6.2%

ground_floor_type
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct5
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size24.4 MiB
f
208421 
x
24809 
v
24476 
z
 
998
m
 
507

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters259211
Distinct characters5
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowf
2nd rowx
3rd rowf
4th rowf
5th rowf

Common Values

ValueCountFrequency (%)
f 208421
80.4%
x 24809
 
9.6%
v 24476
 
9.4%
z 998
 
0.4%
m 507
 
0.2%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
f 208421
80.4%
x 24809
 
9.6%
v 24476
 
9.4%
z 998
 
0.4%
m 507
 
0.2%

Most occurring characters

ValueCountFrequency (%)
f 208421
80.4%
x 24809
 
9.6%
v 24476
 
9.4%
z 998
 
0.4%
m 507
 
0.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 259211
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
f 208421
80.4%
x 24809
 
9.6%
v 24476
 
9.4%
z 998
 
0.4%
m 507
 
0.2%

Most occurring scripts

ValueCountFrequency (%)
Latin 259211
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
f 208421
80.4%
x 24809
 
9.6%
v 24476
 
9.4%
z 998
 
0.4%
m 507
 
0.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 259211
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
f 208421
80.4%
x 24809
 
9.6%
v 24476
 
9.4%
z 998
 
0.4%
m 507
 
0.2%

other_floor_type
Categorical

HIGH CORRELATION 

Distinct4
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size24.4 MiB
q
164311 
x
43242 
j
39681 
s
 
11977

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters259211
Distinct characters4
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowq
2nd rowq
3rd rowx
4th rowx
5th rowx

Common Values

ValueCountFrequency (%)
q 164311
63.4%
x 43242
 
16.7%
j 39681
 
15.3%
s 11977
 
4.6%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
q 164311
63.4%
x 43242
 
16.7%
j 39681
 
15.3%
s 11977
 
4.6%

Most occurring characters

ValueCountFrequency (%)
q 164311
63.4%
x 43242
 
16.7%
j 39681
 
15.3%
s 11977
 
4.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 259211
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
q 164311
63.4%
x 43242
 
16.7%
j 39681
 
15.3%
s 11977
 
4.6%

Most occurring scripts

ValueCountFrequency (%)
Latin 259211
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
q 164311
63.4%
x 43242
 
16.7%
j 39681
 
15.3%
s 11977
 
4.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 259211
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
q 164311
63.4%
x 43242
 
16.7%
j 39681
 
15.3%
s 11977
 
4.6%

position
Categorical

IMBALANCE 

Distinct4
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size24.4 MiB
s
201059 
t
42681 
j
 
13186
o
 
2285

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters259211
Distinct characters4
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowt
2nd rows
3rd rowt
4th rows
5th rows

Common Values

ValueCountFrequency (%)
s 201059
77.6%
t 42681
 
16.5%
j 13186
 
5.1%
o 2285
 
0.9%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
s 201059
77.6%
t 42681
 
16.5%
j 13186
 
5.1%
o 2285
 
0.9%

Most occurring characters

ValueCountFrequency (%)
s 201059
77.6%
t 42681
 
16.5%
j 13186
 
5.1%
o 2285
 
0.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 259211
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
s 201059
77.6%
t 42681
 
16.5%
j 13186
 
5.1%
o 2285
 
0.9%

Most occurring scripts

ValueCountFrequency (%)
Latin 259211
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
s 201059
77.6%
t 42681
 
16.5%
j 13186
 
5.1%
o 2285
 
0.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 259211
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
s 201059
77.6%
t 42681
 
16.5%
j 13186
 
5.1%
o 2285
 
0.9%

plan_configuration
Categorical

IMBALANCE 

Distinct10
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size24.4 MiB
d
248746 
q
 
5656
u
 
3632
s
 
344
c
 
323
Other values (5)
 
510

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters259211
Distinct characters10
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowd
2nd rowd
3rd rowd
4th rowd
5th rowd

Common Values

ValueCountFrequency (%)
d 248746
96.0%
q 5656
 
2.2%
u 3632
 
1.4%
s 344
 
0.1%
c 323
 
0.1%
a 247
 
0.1%
o 159
 
0.1%
m 44
 
< 0.1%
n 38
 
< 0.1%
f 22
 
< 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
d 248746
96.0%
q 5656
 
2.2%
u 3632
 
1.4%
s 344
 
0.1%
c 323
 
0.1%
a 247
 
0.1%
o 159
 
0.1%
m 44
 
< 0.1%
n 38
 
< 0.1%
f 22
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
d 248746
96.0%
q 5656
 
2.2%
u 3632
 
1.4%
s 344
 
0.1%
c 323
 
0.1%
a 247
 
0.1%
o 159
 
0.1%
m 44
 
< 0.1%
n 38
 
< 0.1%
f 22
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 259211
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
d 248746
96.0%
q 5656
 
2.2%
u 3632
 
1.4%
s 344
 
0.1%
c 323
 
0.1%
a 247
 
0.1%
o 159
 
0.1%
m 44
 
< 0.1%
n 38
 
< 0.1%
f 22
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 259211
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
d 248746
96.0%
q 5656
 
2.2%
u 3632
 
1.4%
s 344
 
0.1%
c 323
 
0.1%
a 247
 
0.1%
o 159
 
0.1%
m 44
 
< 0.1%
n 38
 
< 0.1%
f 22
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 259211
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
d 248746
96.0%
q 5656
 
2.2%
u 3632
 
1.4%
s 344
 
0.1%
c 323
 
0.1%
a 247
 
0.1%
o 159
 
0.1%
m 44
 
< 0.1%
n 38
 
< 0.1%
f 22
 
< 0.1%

has_superstructure_adobe_mud
Categorical

IMBALANCE 

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size24.4 MiB
0
236304 
1
 
22907

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters259211
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row1
2nd row0
3rd row0
4th row0
5th row1

Common Values

ValueCountFrequency (%)
0 236304
91.2%
1 22907
 
8.8%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 236304
91.2%
1 22907
 
8.8%

Most occurring characters

ValueCountFrequency (%)
0 236304
91.2%
1 22907
 
8.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 259211
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 236304
91.2%
1 22907
 
8.8%

Most occurring scripts

ValueCountFrequency (%)
Common 259211
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 236304
91.2%
1 22907
 
8.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 259211
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 236304
91.2%
1 22907
 
8.8%

has_superstructure_mud_mortar_stone
Categorical

HIGH CORRELATION 

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size24.4 MiB
1
197524 
0
61687 

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters259211
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row1
2nd row1
3rd row1
4th row1
5th row0

Common Values

ValueCountFrequency (%)
1 197524
76.2%
0 61687
 
23.8%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
1 197524
76.2%
0 61687
 
23.8%

Most occurring characters

ValueCountFrequency (%)
1 197524
76.2%
0 61687
 
23.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 259211
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 197524
76.2%
0 61687
 
23.8%

Most occurring scripts

ValueCountFrequency (%)
Common 259211
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 197524
76.2%
0 61687
 
23.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 259211
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 197524
76.2%
0 61687
 
23.8%

has_superstructure_stone_flag
Categorical

IMBALANCE 

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size24.4 MiB
0
250332 
1
 
8879

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters259211
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

ValueCountFrequency (%)
0 250332
96.6%
1 8879
 
3.4%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 250332
96.6%
1 8879
 
3.4%

Most occurring characters

ValueCountFrequency (%)
0 250332
96.6%
1 8879
 
3.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 259211
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 250332
96.6%
1 8879
 
3.4%

Most occurring scripts

ValueCountFrequency (%)
Common 259211
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 250332
96.6%
1 8879
 
3.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 259211
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 250332
96.6%
1 8879
 
3.4%

has_superstructure_cement_mortar_stone
Categorical

IMBALANCE 

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size24.4 MiB
0
254481 
1
 
4730

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters259211
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

ValueCountFrequency (%)
0 254481
98.2%
1 4730
 
1.8%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 254481
98.2%
1 4730
 
1.8%

Most occurring characters

ValueCountFrequency (%)
0 254481
98.2%
1 4730
 
1.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 259211
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 254481
98.2%
1 4730
 
1.8%

Most occurring scripts

ValueCountFrequency (%)
Common 259211
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 254481
98.2%
1 4730
 
1.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 259211
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 254481
98.2%
1 4730
 
1.8%

has_superstructure_mud_mortar_brick
Categorical

IMBALANCE 

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size24.4 MiB
0
241582 
1
 
17629

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters259211
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

ValueCountFrequency (%)
0 241582
93.2%
1 17629
 
6.8%

Length

Histogram of lengths of the category

Common Values (Plot)

2024-04-22T18:28:46.286385image/svg+xmlMatplotlib v3.8.4, https://matplotlib.org/
ValueCountFrequency (%)
0 241582
93.2%
1 17629
 
6.8%

Most occurring characters

ValueCountFrequency (%)
0 241582
93.2%
1 17629
 
6.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 259211
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 241582
93.2%
1 17629
 
6.8%

Most occurring scripts

ValueCountFrequency (%)
Common 259211
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 241582
93.2%
1 17629
 
6.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 259211
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 241582
93.2%
1 17629
 
6.8%

has_superstructure_cement_mortar_brick
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 24.4 MiB

Value  Count
0      239677
1      19534

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 259211
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count   Frequency (%)
0      239677  92.5%
1      19534   7.5%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count   Frequency (%)
0      239677  92.5%
1      19534   7.5%

Most occurring characters

Value  Count   Frequency (%)
0      239677  92.5%
1      19534   7.5%

Most occurring categories

Value           Count   Frequency (%)
Decimal Number  259211  100.0%

Most frequent character per category

Decimal Number
Value  Count   Frequency (%)
0      239677  92.5%
1      19534   7.5%

Most occurring scripts

Value   Count   Frequency (%)
Common  259211  100.0%

Most frequent character per script

Common
Value  Count   Frequency (%)
0      239677  92.5%
1      19534   7.5%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  259211  100.0%

Most frequent character per block

ASCII
Value  Count   Frequency (%)
0      239677  92.5%
1      19534   7.5%

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 24.4 MiB

Value  Count
0      193242
1      65969

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 259211
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 1
5th row: 0

Common Values

Value  Count   Frequency (%)
0      193242  74.6%
1      65969   25.4%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count   Frequency (%)
0      193242  74.6%
1      65969   25.4%

Most occurring characters

Value  Count   Frequency (%)
0      193242  74.6%
1      65969   25.4%

Most occurring categories

Value           Count   Frequency (%)
Decimal Number  259211  100.0%

Most frequent character per category

Decimal Number
Value  Count   Frequency (%)
0      193242  74.6%
1      65969   25.4%

Most occurring scripts

Value   Count   Frequency (%)
Common  259211  100.0%

Most frequent character per script

Common
Value  Count   Frequency (%)
0      193242  74.6%
1      65969   25.4%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  259211  100.0%

Most frequent character per block

ASCII
Value  Count   Frequency (%)
0      193242  74.6%
1      65969   25.4%

has_superstructure_bamboo
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 24.4 MiB

Value  Count
0      237201
1      22010

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 259211
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 1
5th row: 0

Common Values

Value  Count   Frequency (%)
0      237201  91.5%
1      22010   8.5%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count   Frequency (%)
0      237201  91.5%
1      22010   8.5%

Most occurring characters

Value  Count   Frequency (%)
0      237201  91.5%
1      22010   8.5%

Most occurring categories

Value           Count   Frequency (%)
Decimal Number  259211  100.0%

Most frequent character per category

Decimal Number
Value  Count   Frequency (%)
0      237201  91.5%
1      22010   8.5%

Most occurring scripts

Value   Count   Frequency (%)
Common  259211  100.0%

Most frequent character per script

Common
Value  Count   Frequency (%)
0      237201  91.5%
1      22010   8.5%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  259211  100.0%

Most frequent character per block

ASCII
Value  Count   Frequency (%)
0      237201  91.5%
1      22010   8.5%

has_superstructure_rc_non_engineered
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 24.4 MiB

Value  Count
0      248157
1      11054

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 259211
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count   Frequency (%)
0      248157  95.7%
1      11054   4.3%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count   Frequency (%)
0      248157  95.7%
1      11054   4.3%

Most occurring characters

Value  Count   Frequency (%)
0      248157  95.7%
1      11054   4.3%

Most occurring categories

Value           Count   Frequency (%)
Decimal Number  259211  100.0%

Most frequent character per category

Decimal Number
Value  Count   Frequency (%)
0      248157  95.7%
1      11054   4.3%

Most occurring scripts

Value   Count   Frequency (%)
Common  259211  100.0%

Most frequent character per script

Common
Value  Count   Frequency (%)
0      248157  95.7%
1      11054   4.3%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  259211  100.0%

Most frequent character per block

ASCII
Value  Count   Frequency (%)
0      248157  95.7%
1      11054   4.3%

has_superstructure_rc_engineered
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 24.4 MiB

Value  Count
0      255098
1      4113

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 259211
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count   Frequency (%)
0      255098  98.4%
1      4113    1.6%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count   Frequency (%)
0      255098  98.4%
1      4113    1.6%

Most occurring characters

Value  Count   Frequency (%)
0      255098  98.4%
1      4113    1.6%

Most occurring categories

Value           Count   Frequency (%)
Decimal Number  259211  100.0%

Most frequent character per category

Decimal Number
Value  Count   Frequency (%)
0      255098  98.4%
1      4113    1.6%

Most occurring scripts

Value   Count   Frequency (%)
Common  259211  100.0%

Most frequent character per script

Common
Value  Count   Frequency (%)
0      255098  98.4%
1      4113    1.6%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  259211  100.0%

Most frequent character per block

ASCII
Value  Count   Frequency (%)
0      255098  98.4%
1      4113    1.6%

has_superstructure_other
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 24.4 MiB

Value  Count
0      255328
1      3883

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 259211
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count   Frequency (%)
0      255328  98.5%
1      3883    1.5%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count   Frequency (%)
0      255328  98.5%
1      3883    1.5%

Most occurring characters

Value  Count   Frequency (%)
0      255328  98.5%
1      3883    1.5%

Most occurring categories

Value           Count   Frequency (%)
Decimal Number  259211  100.0%

Most frequent character per category

Decimal Number
Value  Count   Frequency (%)
0      255328  98.5%
1      3883    1.5%

Most occurring scripts

Value   Count   Frequency (%)
Common  259211  100.0%

Most frequent character per script

Common
Value  Count   Frequency (%)
0      255328  98.5%
1      3883    1.5%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  259211  100.0%

Most frequent character per block

ASCII
Value  Count   Frequency (%)
0      255328  98.5%
1      3883    1.5%

legal_ownership_status
Categorical

IMBALANCE 

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 24.4 MiB

Value  Count
v      249591
a      5489
w      2664
r      1467

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 259211
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: v
2nd row: v
3rd row: v
4th row: v
5th row: v

Common Values

Value  Count   Frequency (%)
v      249591  96.3%
a      5489    2.1%
w      2664    1.0%
r      1467    0.6%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count   Frequency (%)
v      249591  96.3%
a      5489    2.1%
w      2664    1.0%
r      1467    0.6%

Most occurring characters

Value  Count   Frequency (%)
v      249591  96.3%
a      5489    2.1%
w      2664    1.0%
r      1467    0.6%

Most occurring categories

Value             Count   Frequency (%)
Lowercase Letter  259211  100.0%

Most frequent character per category

Lowercase Letter
Value  Count   Frequency (%)
v      249591  96.3%
a      5489    2.1%
w      2664    1.0%
r      1467    0.6%

Most occurring scripts

Value  Count   Frequency (%)
Latin  259211  100.0%

Most frequent character per script

Latin
Value  Count   Frequency (%)
v      249591  96.3%
a      5489    2.1%
w      2664    1.0%
r      1467    0.6%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  259211  100.0%

Most frequent character per block

ASCII
Value  Count   Frequency (%)
v      249591  96.3%
a      5489    2.1%
w      2664    1.0%
r      1467    0.6%

count_families
Real number (ℝ)

ZEROS 

Distinct: 10
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.98415962
Minimum: 0
Maximum: 9
Zeros: 20679
Zeros (%): 8.0%
Negative: 0
Negative (%): 0.0%
Memory size: 12.0 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 1
median: 1
Q3: 1
95-th percentile: 2
Maximum: 9
Range: 9
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.41777267
Coefficient of variation (CV): 0.42449686
Kurtosis: 17.643848
Mean: 0.98415962
Median Absolute Deviation (MAD): 0
Skewness: 1.630183
Sum: 255105
Variance: 0.174534
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=10)
Value  Count   Frequency (%)
1      224987  86.8%
0      20679   8.0%
2      11230   4.3%
3      1794    0.7%
4      386     0.1%
5      101     < 0.1%
6      21      < 0.1%
7      7       < 0.1%
9      4       < 0.1%
8      2       < 0.1%

Value  Count   Frequency (%)
0      20679   8.0%
1      224987  86.8%
2      11230   4.3%
3      1794    0.7%
4      386     0.1%
5      101     < 0.1%
6      21      < 0.1%
7      7       < 0.1%
8      2       < 0.1%
9      4       < 0.1%

Value  Count   Frequency (%)
9      4       < 0.1%
8      2       < 0.1%
7      7       < 0.1%
6      21      < 0.1%
5      101     < 0.1%
4      386     0.1%
3      1794    0.7%
2      11230   4.3%
1      224987  86.8%
0      20679   8.0%
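
The descriptive statistics reported for count_families can be cross-checked against its value counts. The sketch below is a minimal illustration, not the profiler's own code. The `summarize` helper is a hypothetical name, and the sample variance with an n - 1 denominator is an assumption that happens to agree with the reported figures.

```python
import math

# Value counts for count_families, copied from the frequency tables above.
value_counts = {0: 20679, 1: 224987, 2: 11230, 3: 1794, 4: 386,
                5: 101, 6: 21, 7: 7, 8: 2, 9: 4}

def summarize(counts):
    """Recompute basic descriptive statistics from a {value: count} mapping."""
    n = sum(counts.values())
    total = sum(value * count for value, count in counts.items())
    mean = total / n
    # Sample variance (n - 1 denominator); assumed, not stated in the report.
    variance = sum(count * (value - mean) ** 2
                   for value, count in counts.items()) / (n - 1)
    std = math.sqrt(variance)
    return {
        "n": n,
        "sum": total,
        "mean": mean,
        "variance": variance,
        "std": std,
        "cv": std / mean,  # coefficient of variation
    }

stats = summarize(value_counts)
for name, value in stats.items():
    print(f"{name}: {value}")
```

Under these assumptions the recomputed count, sum, mean, variance, standard deviation and coefficient of variation should line up with 259211, 255105, 0.98415962, 0.174534, 0.41777267 and 0.42449686 up to rounding.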

has_secondary_use
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 24.4 MiB

Value  Count
0      230209
1      29002

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 259211
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count   Frequency (%)
0      230209  88.8%
1      29002   11.2%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count   Frequency (%)
0      230209  88.8%
1      29002   11.2%

Most occurring characters

Value  Count   Frequency (%)
0      230209  88.8%
1      29002   11.2%

Most occurring categories

Value           Count   Frequency (%)
Decimal Number  259211  100.0%

Most frequent character per category

Decimal Number
Value  Count   Frequency (%)
0      230209  88.8%
1      29002   11.2%

Most occurring scripts

Value   Count   Frequency (%)
Common  259211  100.0%

Most frequent character per script

Common
Value  Count   Frequency (%)
0      230209  88.8%
1      29002   11.2%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  259211  100.0%

Most frequent character per block

ASCII
Value  Count   Frequency (%)
0      230209  88.8%
1      29002   11.2%

has_secondary_use_agriculture
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 24.4 MiB

Value  Count
0      242507
1      16704

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 259211
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count   Frequency (%)
0      242507  93.6%
1      16704   6.4%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count   Frequency (%)
0      242507  93.6%
1      16704   6.4%

Most occurring characters

Value  Count   Frequency (%)
0      242507  93.6%
1      16704   6.4%

Most occurring categories

Value           Count   Frequency (%)
Decimal Number  259211  100.0%

Most frequent character per category

Decimal Number
Value  Count   Frequency (%)
0      242507  93.6%
1      16704   6.4%

Most occurring scripts

Value   Count   Frequency (%)
Common  259211  100.0%

Most frequent character per script

Common
Value  Count   Frequency (%)
0      242507  93.6%
1      16704   6.4%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  259211  100.0%

Most frequent character per block

ASCII
Value  Count   Frequency (%)
0      242507  93.6%
1      16704   6.4%
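
has_secondary_use_agriculture carries an IMBALANCE alert while has_secondary_use, with a 88.8% / 11.2% split, does not. The report does not state how the alert is computed. The sketch below assumes the score is one minus the Shannon entropy of the class shares, normalized by the maximum entropy for that number of classes, with a cut-off of 0.5. Both the formula and the threshold are assumptions, and `imbalance_score` is a hypothetical helper name.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy (in bits) of the
    class shares and k is the number of classes. 0 means perfectly balanced,
    values near 1 mean one class dominates."""
    n = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0
    entropy = -sum((c / n) * math.log2(c / n) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)

THRESHOLD = 0.5  # assumed cut-off for raising the alert

# Counts copied from the tables of the two variables.
secondary_use = imbalance_score([230209, 29002])
agriculture = imbalance_score([242507, 16704])

print(f"has_secondary_use: {secondary_use:.3f} flagged={secondary_use > THRESHOLD}")
print(f"has_secondary_use_agriculture: {agriculture:.3f} flagged={agriculture > THRESHOLD}")
```

Under these assumptions the first variable lands just below the cut-off and the second clearly above it, which is consistent with the alerts shown for the two variables.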

has_secondary_use_hotel
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 24.4 MiB

Value  Count
0      250501
1      8710

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 259211
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count   Frequency (%)
0      250501  96.6%
1      8710    3.4%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count   Frequency (%)
0      250501  96.6%
1      8710    3.4%

Most occurring characters

Value  Count   Frequency (%)
0      250501  96.6%
1      8710    3.4%

Most occurring categories

Value           Count   Frequency (%)
Decimal Number  259211  100.0%

Most frequent character per category

Decimal Number
Value  Count   Frequency (%)
0      250501  96.6%
1      8710    3.4%

Most occurring scripts

Value   Count   Frequency (%)
Common  259211  100.0%

Most frequent character per script

Common
Value  Count   Frequency (%)
0      250501  96.6%
1      8710    3.4%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  259211  100.0%

Most frequent character per block

ASCII
Value  Count   Frequency (%)
0      250501  96.6%
1      8710    3.4%

has_secondary_use_rental
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 24.4 MiB

Value  Count
0      257124
1      2087

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 259211
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count   Frequency (%)
0      257124  99.2%
1      2087    0.8%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count   Frequency (%)
0      257124  99.2%
1      2087    0.8%

Most occurring characters

Value  Count   Frequency (%)
0      257124  99.2%
1      2087    0.8%

Most occurring categories

Value           Count   Frequency (%)
Decimal Number  259211  100.0%

Most frequent character per category

Decimal Number
Value  Count   Frequency (%)
0      257124  99.2%
1      2087    0.8%

Most occurring scripts

Value   Count   Frequency (%)
Common  259211  100.0%

Most frequent character per script

Common
Value  Count   Frequency (%)
0      257124  99.2%
1      2087    0.8%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  259211  100.0%

Most frequent character per block

ASCII
Value  Count   Frequency (%)
0      257124  99.2%
1      2087    0.8%

has_secondary_use_institution
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 24.4 MiB

Value  Count
0      258967
1      244

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 259211
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count   Frequency (%)
0      258967  99.9%
1      244     0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count   Frequency (%)
0      258967  99.9%
1      244     0.1%

Most occurring characters

Value  Count   Frequency (%)
0      258967  99.9%
1      244     0.1%

Most occurring categories

Value           Count   Frequency (%)
Decimal Number  259211  100.0%

Most frequent character per category

Decimal Number
Value  Count   Frequency (%)
0      258967  99.9%
1      244     0.1%

Most occurring scripts

Value   Count   Frequency (%)
Common  259211  100.0%

Most frequent character per script

Common
Value  Count   Frequency (%)
0      258967  99.9%
1      244     0.1%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  259211  100.0%

Most frequent character per block

ASCII
Value  Count   Frequency (%)
0      258967  99.9%
1      244     0.1%

has_secondary_use_school
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 24.4 MiB

Value  Count
0      259117
1      94

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 259211
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count   Frequency (%)
0      259117  > 99.9%
1      94      < 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count   Frequency (%)
0      259117  > 99.9%
1      94      < 0.1%

Most occurring characters

Value  Count   Frequency (%)
0      259117  > 99.9%
1      94      < 0.1%

Most occurring categories

Value           Count   Frequency (%)
Decimal Number  259211  100.0%

Most frequent character per category

Decimal Number
Value  Count   Frequency (%)
0      259117  > 99.9%
1      94      < 0.1%

Most occurring scripts

Value   Count   Frequency (%)
Common  259211  100.0%

Most frequent character per script

Common
Value  Count   Frequency (%)
0      259117  > 99.9%
1      94      < 0.1%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  259211  100.0%

Most frequent character per block

ASCII
Value  Count   Frequency (%)
0      259117  > 99.9%
1      94      < 0.1%

has_secondary_use_industry
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 24.4 MiB

Value  Count
0      258932
1      279

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 259211
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count   Frequency (%)
0      258932  99.9%
1      279     0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count   Frequency (%)
0      258932  99.9%
1      279     0.1%

Most occurring characters

Value  Count   Frequency (%)
0      258932  99.9%
1      279     0.1%

Most occurring categories

Value           Count   Frequency (%)
Decimal Number  259211  100.0%

Most frequent character per category

Decimal Number
Value  Count   Frequency (%)
0      258932  99.9%
1      279     0.1%

Most occurring scripts

Value   Count   Frequency (%)
Common  259211  100.0%

Most frequent character per script

Common
Value  Count   Frequency (%)
0      258932  99.9%
1      279     0.1%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  259211  100.0%

Most frequent character per block

ASCII
Value  Count   Frequency (%)
0      258932  99.9%
1      279     0.1%

has_secondary_use_health_post
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 24.4 MiB

Value  Count
0      259162
1      49

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 259211
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count   Frequency (%)
0      259162  > 99.9%
1      49      < 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count   Frequency (%)
0      259162  > 99.9%
1      49      < 0.1%

Most occurring characters

Value  Count   Frequency (%)
0      259162  > 99.9%
1      49      < 0.1%

Most occurring categories

Value           Count   Frequency (%)
Decimal Number  259211  100.0%

Most frequent character per category

Decimal Number
Value  Count   Frequency (%)
0      259162  > 99.9%
1      49      < 0.1%

Most occurring scripts

Value   Count   Frequency (%)
Common  259211  100.0%

Most frequent character per script

Common
Value  Count   Frequency (%)
0      259162  > 99.9%
1      49      < 0.1%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  259211  100.0%

Most frequent character per block

ASCII
Value  Count   Frequency (%)
0      259162  > 99.9%
1      49      < 0.1%

has_secondary_use_gov_office
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 24.4 MiB

Value  Count
0      259173
1      38

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 259211
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count   Frequency (%)
0      259173  > 99.9%
1      38      < 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count   Frequency (%)
0      259173  > 99.9%
1      38      < 0.1%

Most occurring characters

Value  Count   Frequency (%)
0      259173  > 99.9%
1      38      < 0.1%

Most occurring categories

Value           Count   Frequency (%)
Decimal Number  259211  100.0%

Most frequent character per category

Decimal Number
Value  Count   Frequency (%)
0      259173  > 99.9%
1      38      < 0.1%

Most occurring scripts

Value   Count   Frequency (%)
Common  259211  100.0%

Most frequent character per script

Common
Value  Count   Frequency (%)
0      259173  > 99.9%
1      38      < 0.1%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  259211  100.0%

Most frequent character per block

ASCII
Value  Count   Frequency (%)
0      259173  > 99.9%
1      38      < 0.1%

has_secondary_use_use_police
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 24.4 MiB

Value  Count
0      259188
1      23

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 259211
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count   Frequency (%)
0      259188  > 99.9%
1      23      < 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count   Frequency (%)
0      259188  > 99.9%
1      23      < 0.1%

Most occurring characters

Value  Count   Frequency (%)
0      259188  > 99.9%
1      23      < 0.1%

Most occurring categories

Value           Count   Frequency (%)
Decimal Number  259211  100.0%

Most frequent character per category

Decimal Number
Value  Count   Frequency (%)
0      259188  > 99.9%
1      23      < 0.1%

Most occurring scripts

Value   Count   Frequency (%)
Common  259211  100.0%

Most frequent character per script

Common
Value  Count   Frequency (%)
0      259188  > 99.9%
1      23      < 0.1%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  259211  100.0%

Most frequent character per block

ASCII
Value  Count   Frequency (%)
0      259188  > 99.9%
1      23      < 0.1%

has_secondary_use_other
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 24.4 MiB

Value  Count
0      257880
1      1331

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 259211
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count   Frequency (%)
0      257880  99.5%
1      1331    0.5%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count   Frequency (%)
0      257880  99.5%
1      1331    0.5%

Most occurring characters

Value  Count   Frequency (%)
0      257880  99.5%
1      1331    0.5%

Most occurring categories

Value           Count   Frequency (%)
Decimal Number  259211  100.0%

Most frequent character per category

Decimal Number
Value  Count   Frequency (%)
0      257880  99.5%
1      1331    0.5%

Most occurring scripts

Value   Count   Frequency (%)
Common  259211  100.0%

Most frequent character per script

Common
Value  Count   Frequency (%)
0      257880  99.5%
1      1331    0.5%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  259211  100.0%

Most frequent character per block

ASCII
Value  Count   Frequency (%)
0      257880  99.5%
1      1331    0.5%

damage_grade
Categorical

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 24.4 MiB

Value  Count
2      147437
3      86829
1      24945

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 259211
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 3
2nd row: 2
3rd row: 3
4th row: 2
5th row: 3

Common Values

Value  Count   Frequency (%)
2      147437  56.9%
3      86829   33.5%
1      24945   9.6%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count   Frequency (%)
2      147437  56.9%
3      86829   33.5%
1      24945   9.6%

Most occurring characters

Value  Count   Frequency (%)
2      147437  56.9%
3      86829   33.5%
1      24945   9.6%

Most occurring categories

Value           Count   Frequency (%)
Decimal Number  259211  100.0%

Most frequent character per category

Decimal Number
Value  Count   Frequency (%)
2      147437  56.9%
3      86829   33.5%
1      24945   9.6%

Most occurring scripts

Value   Count   Frequency (%)
Common  259211  100.0%

Most frequent character per script

Common
Value  Count   Frequency (%)
2      147437  56.9%
3      86829   33.5%
1      24945   9.6%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  259211  100.0%

Most frequent character per block

ASCII
Value  Count   Frequency (%)
2      147437  56.9%
3      86829   33.5%
1      24945   9.6%

Interactions


Correlations

age area_percentage count_families count_floors_pre_eq damage_grade foundation_type geo_level_1_id geo_level_2_id geo_level_3_id ground_floor_type has_secondary_use has_secondary_use_agriculture has_secondary_use_gov_office has_secondary_use_health_post has_secondary_use_hotel has_secondary_use_industry has_secondary_use_institution has_secondary_use_other has_secondary_use_rental has_secondary_use_school has_secondary_use_use_police has_superstructure_adobe_mud has_superstructure_bamboo has_superstructure_cement_mortar_brick has_superstructure_cement_mortar_stone has_superstructure_mud_mortar_brick has_superstructure_mud_mortar_stone has_superstructure_other has_superstructure_rc_engineered has_superstructure_rc_non_engineered has_superstructure_stone_flag has_superstructure_timber height_percentage land_surface_condition legal_ownership_status other_floor_type plan_configuration position roof_type
age 1.000 -0.018 0.049 0.258 0.119 0.115 -0.063 0.034 -0.003 0.089 0.032 0.020 0.000 0.002 0.050 0.008 0.011 0.007 0.026 0.008 0.000 0.207 0.046 0.124 0.037 0.285 0.142 0.014 0.084 0.074 0.030 0.040 0.199 0.029 0.048 0.114 0.014 0.192 0.091
area_percentage -0.018 1.000 0.078 0.125 0.101 0.166 0.038 -0.024 0.000 0.164 0.117 0.019 0.036 0.015 0.157 0.019 0.057 0.015 0.105 0.077 0.000 0.036 0.030 0.212 0.075 0.064 0.238 0.012 0.223 0.190 0.004 0.057 0.210 0.019 0.012 0.187 0.047 0.047 0.251
count_families 0.049 0.078 1.000 0.079 0.061 0.055 0.037 -0.012 -0.002 0.047 0.114 0.051 0.017 0.012 0.080 0.029 0.033 0.019 0.094 0.029 0.019 0.034 0.030 0.049 0.011 0.035 0.063 0.006 0.066 0.060 0.009 0.035 0.063 0.014 0.011 0.071 0.008 0.033 0.080
count_floors_pre_eq 0.258 0.125 0.079 1.000 0.154 0.144 -0.088 0.044 -0.015 0.123 0.074 0.055 0.011 0.008 0.140 0.022 0.026 0.010 0.066 0.025 0.003 0.204 0.079 0.259 0.030 0.392 0.358 0.033 0.132 0.107 0.050 0.098 0.756 0.047 0.067 0.582 0.036 0.314 0.182
damage_grade 0.119 0.101 0.061 0.154 1.000 0.306 -0.035 0.041 0.009 0.265 0.080 0.047 0.010 0.008 0.108 0.013 0.033 0.016 0.102 0.014 0.000 0.075 0.064 0.281 0.061 0.063 0.336 0.033 0.238 0.188 0.067 0.070 0.081 0.029 0.071 0.246 0.058 0.045 0.241
foundation_type 0.115 0.166 0.055 0.144 0.306 1.000 0.164 -0.033 0.007 0.356 0.173 0.055 0.026 0.019 0.256 0.022 0.068 0.015 0.186 0.038 0.008 0.105 0.304 0.510 0.203 0.076 0.554 0.115 0.544 0.506 0.145 0.342 -0.135 0.032 0.150 0.411 0.057 0.098 0.548
geo_level_1_id -0.063 0.038 0.037 -0.088 -0.035 0.164 1.000 -0.067 0.004 0.110 0.098 0.126 0.006 0.004 0.029 0.010 0.006 0.044 0.046 0.007 0.004 0.269 0.237 0.215 0.066 0.285 0.332 0.060 0.062 0.045 0.113 0.207 -0.076 0.033 0.089 0.129 0.030 0.136 0.206
geo_level_2_id 0.034 -0.024 -0.012 0.044 0.041 -0.033 -0.067 1.000 -0.000 0.069 0.016 0.036 0.003 0.000 0.029 0.007 0.008 0.030 0.043 0.004 0.006 0.077 0.092 0.124 0.026 0.099 0.150 0.041 0.068 0.063 0.063 0.090 0.038 0.049 0.064 0.070 0.018 0.078 0.087
geo_level_3_id -0.003 0.000 -0.002 -0.015 0.009 0.007 0.004 -0.000 1.000 0.026 0.015 0.023 0.002 0.003 0.012 0.005 0.008 0.015 0.016 0.006 0.000 0.038 0.037 0.034 0.020 0.055 0.035 0.027 0.036 0.031 0.036 0.050 -0.018 0.033 0.034 0.029 0.008 0.026 0.035
ground_floor_type 0.089 0.164 0.047 0.123 0.265 0.356 0.110 0.069 0.026 1.000 0.154 0.068 0.020 0.018 0.249 0.025 0.060 0.040 0.162 0.037 0.006 0.082 0.082 0.586 0.153 0.051 0.500 0.028 0.363 0.364 0.131 0.102 0.000 0.045 0.032 0.364 0.059 0.080 0.472
has_secondary_use 0.032 0.117 0.114 0.074 0.080 0.173 0.098 0.016 0.015 0.154 1.000 0.739 0.034 0.038 0.525 0.092 0.086 0.202 0.254 0.053 0.026 0.013 0.022 0.076 0.042 0.011 0.087 0.000 0.104 0.109 0.000 0.023 0.055 0.009 0.025 0.182 0.030 0.118 0.160
has_secondary_use_agriculture 0.020 0.019 0.051 0.055 0.047 0.055 0.126 0.036 0.023 0.068 0.739 1.000 0.002 0.002 0.049 0.008 0.008 0.085 0.023 0.004 0.000 0.003 0.005 0.054 0.016 0.038 0.058 0.006 0.029 0.023 0.011 0.002 -0.004 0.006 0.012 0.065 0.017 0.034 0.060
has_secondary_use_gov_office 0.000 0.036 0.017 0.011 0.010 0.026 0.006 0.003 0.002 0.020 0.034 0.002 1.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.001 0.000 0.007 0.011 0.002 0.011 0.000 0.030 0.004 0.001 0.003 0.010 0.000 0.000 0.026 0.000 0.008 0.022
has_secondary_use_health_post 0.002 0.015 0.012 0.008 0.008 0.019 0.004 0.000 0.003 0.018 0.038 0.002 0.000 1.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.002 0.003 0.008 0.003 0.000 0.008 0.000 0.010 0.007 0.000 0.004 0.007 0.000 0.000 0.019 0.000 0.010 0.017
has_secondary_use_hotel 0.050 0.157 0.080 0.140 0.108 0.256 0.029 0.029 0.012 0.249 0.525 0.049 0.000 0.000 1.000 0.005 0.005 0.003 0.017 0.002 0.000 0.013 0.031 0.139 0.072 0.024 0.159 0.007 0.140 0.158 0.009 0.028 0.075 0.013 0.043 0.262 0.049 0.199 0.233
has_secondary_use_industry 0.008 0.019 0.029 0.022 0.013 0.022 0.010 0.007 0.005 0.025 0.092 0.008 0.000 0.000 0.005 1.000 0.000 0.003 0.001 0.000 0.000 0.000 0.002 0.026 0.006 0.011 0.025 0.000 0.004 0.015 0.003 0.000 -0.001 0.000 0.004 0.023 0.005 0.012 0.016
has_secondary_use_institution 0.011 0.057 0.033 0.026 0.033 0.068 0.006 0.008 0.008 0.060 0.086 0.008 0.000 0.000 0.005 0.000 1.000 0.003 0.001 0.000 0.000 0.004 0.004 0.032 0.007 0.001 0.036 0.001 0.050 0.036 0.000 0.005 0.020 0.004 0.000 0.075 0.012 0.015 0.065
has_secondary_use_other 0.007 0.015 0.019 0.010 0.016 0.015 0.044 0.030 0.015 0.040 0.202 0.085 0.000 0.000 0.003 0.003 0.003 1.000 0.001 0.000 0.000 0.010 0.008 0.000 0.014 0.004 0.005 0.006 0.008 0.000 0.001 0.014 0.001 0.017 0.016 0.021 0.000 0.008 0.010
has_secondary_use_rental 0.026 0.105 0.094 0.066 0.102 0.186 0.046 0.043 0.016 0.162 0.254 0.023 0.000 0.000 0.017 0.001 0.001 0.001 1.000 0.000 0.000 0.004 0.019 0.110 0.034 0.017 0.117 0.000 0.130 0.103 0.011 0.026 0.046 0.008 0.005 0.196 0.027 0.060 0.188
has_secondary_use_school 0.008 0.077 0.029 0.025 0.014 0.038 0.007 0.004 0.006 0.037 0.053 0.004 0.000 0.000 0.002 0.000 0.000 0.000 0.000 1.000 0.000 0.000 0.003 0.019 0.005 0.000 0.023 0.000 0.024 0.020 0.000 0.003 0.010 0.004 0.008 0.044 0.032 0.007 0.032
has_secondary_use_use_police 0.000 0.000 0.019 0.003 0.000 0.008 0.004 0.006 0.000 0.006 0.026 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 1.000 0.000 0.001 0.004 0.003 0.000 0.002 0.000 0.003 0.000 0.000 0.001 0.004 0.002 0.000 0.005 0.012 0.000 0.008
has_superstructure_adobe_mud 0.207 0.036 0.034 0.204 0.075 0.105 0.269 0.077 0.038 0.082 0.013 0.003 0.001 0.002 0.013 0.000 0.004 0.010 0.004 0.000 0.000 1.000 0.012 0.037 0.014 0.315 0.306 0.058 0.037 0.037 0.007 0.013 0.154 0.020 0.051 0.090 0.027 0.194 0.073
has_superstructure_bamboo 0.046 0.030 0.030 0.079 0.064 0.304 0.237 0.092 0.037 0.082 0.022 0.005 0.000 0.003 0.031 0.002 0.004 0.008 0.019 0.003 0.001 0.012 1.000 0.055 0.003 0.000 0.056 0.117 0.037 0.020 0.079 0.438 -0.059 0.032 0.088 0.070 0.024 0.055 0.095
has_superstructure_cement_mortar_brick 0.124 0.212 0.049 0.259 0.281 0.510 0.215 0.124 0.034 0.586 0.076 0.054 0.007 0.008 0.139 0.026 0.032 0.000 0.110 0.019 0.004 0.037 0.055 1.000 0.079 0.031 0.471 0.006 0.121 0.139 0.044 0.058 -0.048 0.060 0.078 0.442 0.105 0.119 0.420
has_superstructure_cement_mortar_stone 0.037 0.075 0.011 0.030 0.061 0.203 0.066 0.026 0.020 0.153 0.042 0.016 0.011 0.003 0.072 0.006 0.007 0.014 0.034 0.005 0.003 0.014 0.003 0.079 1.000 0.000 0.104 0.012 0.025 0.076 0.036 0.015 0.005 0.013 0.011 0.097 0.026 0.029 0.084
has_superstructure_mud_mortar_brick 0.285 0.064 0.035 0.392 0.063 0.076 0.285 0.099 0.055 0.051 0.011 0.038 0.002 0.000 0.024 0.011 0.001 0.004 0.017 0.000 0.000 0.315 0.000 0.031 0.000 1.000 0.376 0.027 0.026 0.029 0.033 0.000 0.179 0.064 0.034 0.038 0.044 0.349 0.036
has_superstructure_mud_mortar_stone 0.142 0.238 0.063 0.358 0.336 0.554 0.332 0.150 0.035 0.500 0.087 0.058 0.011 0.008 0.159 0.025 0.036 0.005 0.117 0.023 0.002 0.306 0.056 0.471 0.104 0.376 1.000 0.042 0.225 0.223 0.033 0.042 -0.037 0.080 0.148 0.452 0.120 0.282 0.439
has_superstructure_other 0.014 0.012 0.006 0.033 0.033 0.115 0.060 0.041 0.027 0.028 0.000 0.006 0.000 0.000 0.007 0.000 0.001 0.006 0.000 0.000 0.000 0.058 0.117 0.006 0.012 0.027 0.042 1.000 0.009 0.018 0.066 0.105 -0.017 0.035 0.020 0.037 0.019 0.000 0.020
has_superstructure_rc_engineered 0.084 0.223 0.066 0.132 0.238 0.544 0.062 0.068 0.036 0.363 0.104 0.029 0.030 0.010 0.140 0.004 0.050 0.008 0.130 0.024 0.003 0.037 0.037 0.121 0.025 0.026 0.225 0.009 1.000 0.012 0.021 0.069 0.070 0.028 0.013 0.420 0.042 0.095 0.467
has_superstructure_rc_non_engineered 0.074 0.190 0.060 0.107 0.188 0.506 0.045 0.063 0.031 0.364 0.109 0.023 0.004 0.007 0.158 0.015 0.036 0.000 0.103 0.020 0.000 0.037 0.020 0.139 0.076 0.029 0.223 0.018 0.012 1.000 0.008 0.028 0.044 0.011 0.008 0.389 0.048 0.089 0.447
has_superstructure_stone_flag 0.030 0.004 0.009 0.050 0.067 0.145 0.113 0.063 0.036 0.131 0.000 0.011 0.001 0.000 0.009 0.003 0.000 0.001 0.011 0.000 0.000 0.007 0.079 0.044 0.036 0.033 0.033 0.066 0.021 0.008 1.000 0.125 -0.017 0.046 0.012 0.130 0.016 0.020 0.043
has_superstructure_timber 0.040 0.057 0.035 0.098 0.070 0.342 0.207 0.090 0.050 0.102 0.023 0.002 0.003 0.004 0.028 0.000 0.005 0.014 0.026 0.003 0.001 0.013 0.438 0.058 0.015 0.000 0.042 0.105 0.069 0.028 0.125 1.000 -0.039 0.047 0.106 0.161 0.028 0.053 0.142
height_percentage 0.199 0.210 0.063 0.756 0.081 -0.135 -0.076 0.038 -0.018 0.000 0.055 -0.004 0.010 0.007 0.075 -0.001 0.020 0.001 0.046 0.010 0.004 0.154 -0.059 -0.048 0.005 0.179 -0.037 -0.017 0.070 0.044 -0.017 -0.039 1.000 0.019 0.038 0.300 0.020 0.210 0.235
land_surface_condition 0.029 0.019 0.014 0.047 0.029 0.032 0.033 0.049 0.033 0.045 0.009 0.006 0.000 0.000 0.013 0.000 0.004 0.017 0.008 0.004 0.002 0.020 0.032 0.060 0.013 0.064 0.080 0.035 0.028 0.011 0.046 0.047 0.019 1.000 0.021 0.037 0.022 0.033 0.039
legal_ownership_status 0.048 0.012 0.011 0.067 0.071 0.150 0.089 0.064 0.034 0.032 0.025 0.012 0.000 0.000 0.043 0.004 0.000 0.016 0.005 0.008 0.000 0.051 0.088 0.078 0.011 0.034 0.148 0.020 0.013 0.008 0.012 0.106 0.038 0.021 1.000 0.065 0.014 0.029 0.029
other_floor_type 0.114 0.187 0.071 0.582 0.246 0.411 0.129 0.070 0.029 0.364 0.182 0.065 0.026 0.019 0.262 0.023 0.075 0.021 0.196 0.044 0.005 0.090 0.070 0.442 0.097 0.038 0.452 0.037 0.420 0.389 0.130 0.161 0.300 0.037 0.065 1.000 0.064 0.112 0.522
plan_configuration 0.014 0.047 0.008 0.036 0.058 0.057 0.030 0.018 0.008 0.059 0.030 0.017 0.000 0.000 0.049 0.005 0.012 0.000 0.027 0.032 0.012 0.027 0.024 0.105 0.026 0.044 0.120 0.019 0.042 0.048 0.016 0.028 0.020 0.022 0.014 0.064 1.000 0.027 0.063
position 0.192 0.047 0.033 0.314 0.045 0.098 0.136 0.078 0.026 0.080 0.118 0.034 0.008 0.010 0.199 0.012 0.015 0.008 0.060 0.007 0.000 0.194 0.055 0.119 0.029 0.349 0.282 0.000 0.095 0.089 0.020 0.053 0.210 0.033 0.029 0.112 0.027 1.000 0.126
roof_type 0.091 0.251 0.080 0.182 0.241 0.548 0.206 0.087 0.035 0.472 0.160 0.060 0.022 0.017 0.233 0.016 0.065 0.010 0.188 0.032 0.008 0.073 0.095 0.420 0.084 0.036 0.439 0.020 0.467 0.447 0.043 0.142 0.235 0.039 0.029 0.522 0.063 0.126 1.000
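
The report does not state here how each cell of the matrix is computed. As an illustration only, the sketch below shows two measures commonly used for mixed data: Spearman rank correlation for a numeric pair and uncorrected Cramér's V for a categorical pair. Whether the report uses exactly these measures (or a bias-corrected variant) is an assumption; the function names and toy data are hypothetical.

```python
import numpy as np
import pandas as pd


def spearman(x: pd.Series, y: pd.Series) -> float:
    """Spearman rank correlation, computed as the Pearson correlation of ranks."""
    return float(x.rank().corr(y.rank()))


def cramers_v(x: pd.Series, y: pd.Series) -> float:
    """Uncorrected Cramér's V between two categorical series (0 = none, 1 = perfect)."""
    observed = pd.crosstab(x, y).to_numpy(dtype=float)
    n = observed.sum()
    # Expected cell counts under independence: row total * column total / n.
    expected = np.outer(observed.sum(axis=1), observed.sum(axis=0)) / n
    chi2 = ((observed - expected) ** 2 / expected).sum()
    min_dim = min(observed.shape) - 1
    return float(np.sqrt(chi2 / (n * min_dim))) if min_dim > 0 else 0.0


# Toy data (hypothetical values; only the column names echo the dataset).
toy = pd.DataFrame({
    "count_floors_pre_eq": [1, 2, 2, 3, 3, 4],
    "height_percentage": [2, 4, 5, 6, 7, 9],
    "roof_type": ["n", "n", "q", "q", "x", "x"],
    "other_floor_type": ["f", "f", "v", "v", "x", "x"],
})

print(spearman(toy["count_floors_pre_eq"], toy["height_percentage"]))
print(cramers_v(toy["roof_type"], toy["other_floor_type"]))
```

Cramér's V is never negative, whereas Spearman ranges from -1 to 1, so the two measures are not directly comparable cell by cell.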

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
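
The overview reports 0 missing cells, so both plots are uniformly complete for this dataset. A per-column nullity count of the kind these plots visualize can be sketched with pandas as follows; the helper name `missing_summary` and the toy frame are hypothetical and not part of the report.

```python
import pandas as pd


def missing_summary(df: pd.DataFrame) -> pd.DataFrame:
    """Count missing cells per column and express them as a share of all rows."""
    missing = df.isna().sum()
    return pd.DataFrame({
        "missing": missing,
        "missing (%)": (missing / len(df) * 100).round(1),
    })


# Toy frame with one missing cell (hypothetical values).
toy = pd.DataFrame({"age": [30, None, 10], "roof_type": ["n", "n", "q"]})

summary = missing_summary(toy)
print(summary)
print("Missing cells:", int(summary["missing"].sum()))
```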

Sample

geo_level_1_id geo_level_2_id geo_level_3_id count_floors_pre_eq age area_percentage height_percentage land_surface_condition foundation_type roof_type ground_floor_type other_floor_type position plan_configuration has_superstructure_adobe_mud has_superstructure_mud_mortar_stone has_superstructure_stone_flag has_superstructure_cement_mortar_stone has_superstructure_mud_mortar_brick has_superstructure_cement_mortar_brick has_superstructure_timber has_superstructure_bamboo has_superstructure_rc_non_engineered has_superstructure_rc_engineered has_superstructure_other legal_ownership_status count_families has_secondary_use has_secondary_use_agriculture has_secondary_use_hotel has_secondary_use_rental has_secondary_use_institution has_secondary_use_school has_secondary_use_industry has_secondary_use_health_post has_secondary_use_gov_office has_secondary_use_use_police has_secondary_use_other damage_grade
064871219823065trnfqtd11000000000v1000000000003
18900281221087ornxqsd01000000000v1000000000002
221363897321055trnfxtd01000000000v1000000000003
3224181069421065trnfxsd01000011000v1000000000002
411131148833089trnfxsd10000000000v1000000000003
58558608921095trnfqsd01000000000v1110000000002
694751206622534nrnxqsd01000000000v1000000000003
720323122362086twqvxsu00000110000v1000000000001
80757721921586trqfqsd01000010000v1000000000002
92688699410134tinvjsd00000100000v1000000000001
geo_level_1_id geo_level_2_id geo_level_3_id count_floors_pre_eq age area_percentage height_percentage land_surface_condition foundation_type roof_type ground_floor_type other_floor_type position plan_configuration has_superstructure_adobe_mud has_superstructure_mud_mortar_stone has_superstructure_stone_flag has_superstructure_cement_mortar_stone has_superstructure_mud_mortar_brick has_superstructure_cement_mortar_brick has_superstructure_timber has_superstructure_bamboo has_superstructure_rc_non_engineered has_superstructure_rc_engineered has_superstructure_other legal_ownership_status count_families has_secondary_use has_secondary_use_agriculture has_secondary_use_hotel has_secondary_use_rental has_secondary_use_institution has_secondary_use_school has_secondary_use_industry has_secondary_use_health_post has_secondary_use_gov_office has_secondary_use_use_police has_secondary_use_other damage_grade
26059120368598012553nrnfjsd01000000000v1110000000003
260592101382190322555trnfqsd01000010000v1000000000002
2605938767861325135trnfqsd01000000000v1110000000002
260594271811537601312trnfxjd00001000000v1000000000002
2605958268471822085trnfqsd01000000000v1000000000003
260596251335162115563nrnfjsq01000000000v1000000000002
2605971771520602065trnfqsd01000000000v1000000000003
2605981751816335567trqfqsd01000000000v1000000000003
26059926391851210146trxvsjd00000100000v1000000000002
260600219910131076nrnfqjd01000000000v3000000000003

Duplicate rows

Most frequently occurring

geo_level_1_id geo_level_2_id geo_level_3_id count_floors_pre_eq age area_percentage height_percentage land_surface_condition foundation_type roof_type ground_floor_type other_floor_type position plan_configuration has_superstructure_adobe_mud has_superstructure_mud_mortar_stone has_superstructure_stone_flag has_superstructure_cement_mortar_stone has_superstructure_mud_mortar_brick has_superstructure_cement_mortar_brick has_superstructure_timber has_superstructure_bamboo has_superstructure_rc_non_engineered has_superstructure_rc_engineered has_superstructure_other legal_ownership_status count_families has_secondary_use has_secondary_use_agriculture has_secondary_use_hotel has_secondary_use_rental has_secondary_use_institution has_secondary_use_school has_secondary_use_industry has_secondary_use_health_post has_secondary_use_gov_office has_secondary_use_use_police has_secondary_use_other damage_grade # duplicates
45701013826532568trnfqsd01000000000v100000000000329
635217930898432067trnfqsd01000011000v100000000000315
9327272691112121587trnfxsd01000000000v100000000000315
726920863857821097trqxxsd01000010000v100000000000214
1098412181053121065trnfqsd01000000000v100000000000212
35271070942921054trnfqsd01000000000v100000000000212
85872639112461063tuqvjsq00000100000a100000000000112
93312726911121215107trnfxsd01000000000v100000000000312
317081114800221565nrnfqsd01000000000v100000000000311
3233720691722075trnfqsd01000000000v100000000000310
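
A table like the one above, listing the most frequent fully identical rows, can be approximated in pandas by grouping on every column. This is a minimal sketch under that assumption; the helper name `most_frequent_duplicates` and the toy frame are hypothetical, and the report's own counting convention may differ.

```python
import pandas as pd


def most_frequent_duplicates(df: pd.DataFrame, top: int = 10) -> pd.DataFrame:
    """Return the rows that occur more than once, with how often each occurs."""
    repeated = df[df.duplicated(keep=False)]  # every member of a duplicate group
    counts = (
        repeated.groupby(list(df.columns), dropna=False)
        .size()
        .reset_index(name="# duplicates")
        .sort_values("# duplicates", ascending=False)
    )
    return counts.head(top)


# Toy frame (hypothetical values): one row appears 3 times, another twice.
toy = pd.DataFrame({
    "count_floors_pre_eq": [1, 1, 1, 2, 2, 3],
    "roof_type": ["n", "n", "n", "q", "q", "x"],
})

print(most_frequent_duplicates(toy))
# Rows that repeat an earlier row; the quantity behind a "duplicate rows" total.
print("Duplicate rows:", int(toy.duplicated().sum()))
```

Before modelling, `df.drop_duplicates()` would remove the repeats, although identical feature rows may describe distinct buildings, so dropping them is a judgment call.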